Genetic algorithm with a new round-robin based tournament selection: Statistical properties analysis

A round-robin tournament is a contest in which every player plays against all the other players. In this study, we propose a round-robin based tournament selection operator for genetic algorithms (GAs). First, we divide the whole population into two equal and disjoint groups; then each individual of one group competes with all the individuals of the other group. Statistical experimental results reveal that the devised selection operator has relatively better selection pressure along with a minimal loss of population diversity. To check the consistency of the assigned probability distribution with the sampling algorithms, we employ Pearson's chi-square test and the empirical distribution function as goodness-of-fit tests in the analysis of statistical properties. At the cost of a nominal increase in complexity compared with conventional selection approaches, the operator improves the sampling accuracy. Finally, for the global performance, we considered the traveling salesman problem to measure the efficiency of the newly developed selection scheme relative to other competing selection operators and observed an improved performance.


The author(s) received no specific funding for this work.

Introduction
Genetic algorithms (GAs) are stochastic optimization approaches based on the natural mechanisms of genetics. These algorithms mimic the process of natural selection, in which the fittest individuals are selected for reproduction. Generally, five components are considered in a GA: initial population, fitness function, selection, crossover and mutation operators. If parents have good fitness, their offspring are expected to be fitter than them and to have a better chance of survival. The process continues and eventually a generation containing the most qualified individuals is found.
The development of GAs originates from the influential work of Holland (1975). Many scholars have adopted GAs, which are regarded as a key member of optimization-related research. Their global search nature, robustness and reliability are the main reasons for their popularity. For example, Ruxton and Beauchamp (2008) highlighted the applicability of GAs in the field of behavioral ecology to study vigilance behavior in animals. Further, Song et al. (2016) employed GAs to achieve optimal satellite selection for global positioning system (GPS) use. More recently, Soui et al. (2019) explored GAs in modeling membership behavior in the stock market and in increasing the scalability of credit risk assessment. Beyond these, many applications of GAs can be found in the multidisciplinary research literature, including medicine (Forrest et al., 1993) and artificial intelligence (Rubio et al., 2016). The ability to solve complex multidimensional and multimodal optimization problems with minimal information about their objective functions is another reason for their popularity; see, for example, Beyer (1997) and Amsa et al. (2013). Readers may consult Pandey et al. (2014) for a more detailed account of the promising features of GAs and a comprehensive overview.
GAs have been found effective in many domains, but premature convergence remains an issue in the pursuit of optimal solutions; see, for example, Fogel (1994) and Hussain and Muhammad (2020). The complications of premature convergence are rooted in the design philosophy of GAs, as summarized by Julstrom (1999). Maintaining good diversity in the population is required for a GA to succeed. Otherwise, the search gets stuck in local optima, an undesirable situation called premature convergence. The optimization literature acknowledges population diversity as a vital factor in the search for a global optimal solution. This is evident from the discussion on the relation between premature convergence and population diversity; see, for example, Burke (2004); Koumousis and Katsaras (2006); Pandey et al. (2014) and Aibinu et al. (2016). These studies indicate that the performance of a GA is strongly affected by the choice of selection operator. The selection operator is therefore the most crucial research area in the body of contributions associated with GAs.
Owing to the importance of the selection phase in GAs, the current research contributes to the literature by introducing a novel selection operator, namely round-robin based tournament selection (RRTS). The main focus of this research is on facilitating the convergence process while maintaining a desirable level of population diversity. This objective is pursued through a tradeoff between exploration and exploitation. The fitness rank of participants in concordance with the normality of generations is used to aid the selection process, and the encouraging results of this selection scheme are documented in this article.
The remainder of this paper is organized as follows. In Section 2, the selection operator is discussed as a two-stage procedure, with a detailed review. In Section 3, we propose a new selection operator with its theoretical and mathematical foundations. Several stochastic properties of the newly proposed operator are then examined in Section 4. Motivated by these stochastic features, Section 5 describes the applicability of the proposed methodology in solving a practical problem, i.e. the traveling salesman problem (TSP). Lastly, Section 6 summarizes the study along with a brief discussion of perspectives for future research.

Selection procedure
The selection process in a GA can be split into two stages. In the first stage, a selection probability is assigned to each individual based on its fitness value. These probabilities are denoted as $p_i$, $i = 1, 2, \ldots, K$, with $\sum_{i=1}^{K} p_i = 1$, where $K$ is the population size. For further investigation of selection probabilities, readers may consult Julstrom (1999) and Hussain and Muhammad (2020). The second stage is the sampling process, which selects the fittest parents (based on Darwin's "survival-of-the-fittest" criterion) from the current population for the mating process. A thorough discussion of sampling algorithms is provided in Section 4. This study, additionally, contributes to the selection methods of GAs. From this perspective, a new selection operator is proposed that is expected to reinforce the typical character of the population and to offer an improved tradeoff between exploitation and exploration.

Assignment of probability
The first and most popular selection procedure, known as fitness proportional selection (FPS), was proposed by Holland (1975). In this procedure, the selection probability of the $i$th individual, say $p_i$, is directly proportional to its fitness. The method rests on the understanding that a fitter individual ought to have a higher probability of selection, and each individual may become a member of the parent population with probability
$$p_i = \frac{f_i}{\sum_{j=1}^{K} f_j}, \quad i \in \{1, 2, \ldots, K\},$$
where $f_i$ denotes the fitness of the $i$th individual and $K$ is the population size. The operation of FPS is similar to probability proportional to size (PPS) sampling with replacement. Throughout the entire selection process, no alteration is needed in the sizes and probabilities. This method is easy to implement and assigns probabilities to all individuals according to their fitness values, but the scaling problem is its main drawback; see, for instance, Grefenstette (1986).
Linear rank selection (LRS), introduced by Baker (1985), serves as a remedy for the premature convergence associated with FPS. This method gives weaker individuals a relatively better opportunity to be picked and therefore offers a smoother selection function. In the LRS procedure, the $i$th individual is assigned a selection probability using the following formula:
$$p_i = \frac{1}{K}\left(\vartheta^{-} + (\vartheta^{+} - \vartheta^{-})\,\frac{i-1}{K-1}\right), \quad i \in \{1, 2, \ldots, K\},$$
where $i$ is the rank of the individual based on fitness (rank 1 being the worst and rank $K$ the best), and $\vartheta^{-}$ and $\vartheta^{+}$ are the parameters for the selection probabilities of the worst and best individuals based on their ranks, respectively. Two constraints are associated with this scheme: $\vartheta^{+} + \vartheta^{-} = 2$ and $\vartheta^{-} \geq 0$. As a result, even if individuals differ notably in fitness, the ranks remain uniformly spaced, unable to reflect the difference with the desirable intensity, and so naturally discard relevant information. LRS has numerous applications; see, for example, Sharma and Mehta (2013). On the other hand, it has the drawback of slower convergence of the algorithm. This is because its internal methodology uses ranks instead of the fitness values directly for selection; see Pandey et al. (2014); Aibinu et al. (2016) and Hussain and Muhammad (2020). This issue becomes more serious in the case of a larger population, where ranks can be regarded as a realization from the uniform distribution. To resolve this difficulty of LRS, Michalewicz (1992) designed an alternative rank-based selection operator, called exponential rank selection (ERS). In contrast to LRS, Michalewicz (1992) suggested that the selection probabilities increase exponentially from the worst individual to the best one. The selection probability of the individual with the $i$th rank is written as:
$$p_i = \frac{\nu^{K-i}}{\sum_{j=1}^{K} \nu^{K-j}}, \quad i \in \{1, 2, \ldots, K\},$$
where $0 < \nu < 1$ is a fixed ratio defining the weights of individuals based on their fitness ranks; values of $\nu$ close to unity (i.e. $\nu \to 1$) are recommended by Michalewicz (1992).
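The three probability assignments discussed so far can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: it assumes the standard textbook forms $p_i = f_i / \sum_j f_j$ (FPS), $p_i = \frac{1}{K}\left(\vartheta^{-} + (\vartheta^{+} - \vartheta^{-})\frac{i-1}{K-1}\right)$ (LRS) and $p_i = \nu^{K-i} / \sum_j \nu^{K-j}$ (ERS), with rank 1 as the worst individual and rank $K$ as the best. The function names and the default parameter values are hypothetical.

```python
from typing import List


def fps_probabilities(fitness: List[float]) -> List[float]:
    """Fitness proportional selection: p_i = f_i / sum_j f_j.

    Assumes all fitness values are positive (maximization setting).
    """
    total = sum(fitness)
    return [f / total for f in fitness]


def lrs_probabilities(K: int, vartheta_minus: float = 0.5) -> List[float]:
    """Linear rank selection for ranks 1 (worst) .. K (best), K >= 2.

    vartheta_plus follows from the constraint vartheta_plus + vartheta_minus = 2.
    The default 0.5 is an arbitrary illustrative choice.
    """
    vartheta_plus = 2.0 - vartheta_minus
    return [
        (vartheta_minus + (vartheta_plus - vartheta_minus) * (i - 1) / (K - 1)) / K
        for i in range(1, K + 1)
    ]


def ers_probabilities(K: int, nu: float = 0.99) -> List[float]:
    """Exponential rank selection with 0 < nu < 1; weights grow toward rank K."""
    weights = [nu ** (K - i) for i in range(1, K + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

Under these assumed forms, each function returns a vector that sums to one, and the rank-based vectors increase from the worst rank to the best.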
The popularity of ERS as a selection method is evidenced by various applications; see, for example, Schell and Wegenkittl (2001) and Lee et al. (2008). Another selection procedure, based on the real-world phenomenon of a tournament and known as binary tournament selection (BTS), was introduced by Back (1996). In BTS, two competitors are chosen at random, and the winner is selected for the mating process. In this case, the chance of choosing a good parent is very high, but if both competitors are of low quality, a low-quality parent is selected. Recognizing the significance of population diversity, Back (1996) recommended a small tournament size, and pairwise comparison remains the most common theme in tournament selection schemes. The selection probability of the $i$th ordered individual is given as:
$$p_i = K^{-r}\left(i^{r} - (i-1)^{r}\right), \quad i \in \{1, 2, \ldots, K\},$$
where $r$ denotes the tournament size ($r = 2$ for BTS). Julstrom (1999) employed a probability-based threshold to select the winner of the tournament, called probabilistic 2-tournament selection (PTS). In this scheme, the winner of the competition survives with probability $0.5 < q < 1$, whereas the loser is selected with probability $1 - q$. For the PTS method, the $i$th ordered individual is assigned the selection probability by the following rule:
$$p_i = \frac{2(i-1)}{K(K-1)}\,q + \frac{2(K-i)}{K(K-1)}\,(1-q), \quad i \in \{1, 2, \ldots, K\}.$$
This selection procedure has wide applicability; see, for instance, Schell and Wegenkittl (2001); Lee et al. (2008) and Hussain and Muhammad (2020). The most significant feature of a selection operator is its selection pressure, because it is tied to a suitable balance between exploration and exploitation. Eiben et al. (2006) described a scenario in which a relatively low selection pressure is required at the initial stage, to keep diversity in the sampled population, and a higher one at the final stage, to assist the convergence of the algorithm. To trade off between the two extremes, an adjustable selection pressure is required; see, for example, Pham and Castellani (2010).
This article proposes a new selection approach that addresses the weaknesses of fitness-based (i.e. FPS), rank-based (i.e. LRS and ERS) and tournament-based (i.e. BTS and PTS) approaches. It is built on a tournament scheme in which we split all the individuals into two equal and mutually exclusive groups and assign them selection probabilities according to their ranks. The details of how an individual competes with the other group's members to survive as a parent for the mating process are provided in the next section.
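The tournament-based assignments can be sketched in the same way. The sketch below is illustrative only and assumes two forms commonly given in the literature: $p_i = K^{-r}(i^r - (i-1)^r)$ for an $r$-tournament with replacement, and $p_i = \frac{2(i-1)}{K(K-1)}q + \frac{2(K-i)}{K(K-1)}(1-q)$ for the probabilistic 2-tournament with two distinct competitors. Rank 1 is taken as the worst individual; the function names and the default $q$ are hypothetical.

```python
from typing import List


def tournament_probabilities(K: int, r: int = 2) -> List[float]:
    """r-tournament with replacement; r = 2 gives binary tournament selection.

    Individual i wins when it is the best of the r sampled ranks.
    """
    return [(i ** r - (i - 1) ** r) / K ** r for i in range(1, K + 1)]


def pts_probabilities(K: int, q: float = 0.75) -> List[float]:
    """Probabilistic 2-tournament (assumed form, two distinct competitors).

    The better competitor is selected with probability q, the worse one
    with probability 1 - q. The default q = 0.75 is an arbitrary choice.
    """
    denom = K * (K - 1)
    return [
        (2 * (i - 1) * q + 2 * (K - i) * (1 - q)) / denom
        for i in range(1, K + 1)
    ]
```

With $q = 0.5$ the assumed PTS form reduces to the uniform distribution $p_i = 1/K$, which is why $q$ is kept above 0.5 to create selection pressure.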

Motivation
Many selection mechanisms have been proposed in the literature. LRS emphasizes maintaining higher levels of population diversity at the cost of selection pressure, which results in slow convergence of GAs. On the other hand, the FPS method has high selection pressure at the expense of diversity and, as a result, remains the prime candidate for premature convergence. In this section, a new operator capable of achieving a better balance between exploration and exploitation is proposed, which provides sufficient selection pressure throughout the selection process.

Round-robin based tournament selection
An alternative selection scheme, round-robin based tournament selection (RRTS), is proposed to maintain a precise balance between exploration and exploitation. This approach provides adequate selection pressure while eliminating the fitness scaling problem. The proposed selection procedure consists of the following steps:
(i) In the RRTS method, all individuals are ranked according to their fitness measures, and each acquires a distinct rank even if some have equal fitness values.
(ii) The individuals are divided into two equal and disjoint groups, e.g. $A$ and its complement $A^c$. The population can be assigned to these groups in multiple ways, such as: randomly; the first half in one group and the rest in the other (best-worst); the odd individuals in one group and the even in the other (even-odd); up to 25% and 51% to 75% in one group and the rest in the other, etc.
(iii) Now, one individual, say $i$, is chosen at random from a group with a surviving probability $\theta$, while the combined effect of all the members of the other group is taken into account with probability $(1 - \theta)$. Thus, the probability of selecting an individual as a parent is determined by a rule in which, if $i$ belongs to one group, then $j$ belongs to the other group, and $K$ is the population size. Table 1 presents some rules to assign probabilities to all the individuals for $K = 10$ and $\theta = 0.5$. There are two tuning parameters for maintaining a tradeoff between diversity and selection pressure in our proposed method, i.e. the value of $\theta$ and the group segmentation.
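Steps (i) and (ii) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: ties are broken by position in the population, rank 1 is taken as the worst individual, the even-odd scheme is interpreted as splitting by rank parity, and the function names are hypothetical. The probability rule of step (iii) is not reproduced here.

```python
import random
from typing import List, Optional, Tuple


def distinct_ranks(fitness: List[float]) -> List[int]:
    """Assign ranks 1 (worst) .. K (best); equal fitness values still get
    distinct ranks (assumption: ties are broken by position)."""
    order = sorted(range(len(fitness)), key=lambda idx: (fitness[idx], idx))
    ranks = [0] * len(fitness)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def split_groups(
    K: int, scheme: str = "even-odd", rng: Optional[random.Random] = None
) -> Tuple[List[int], List[int]]:
    """Split ranks 1..K into two equal, disjoint groups A and A^c."""
    if K % 2:
        raise ValueError("K must be even to form two equal groups")
    ranks = list(range(1, K + 1))
    if scheme == "best-worst":  # first half of the ranks vs. the rest
        return ranks[: K // 2], ranks[K // 2:]
    if scheme == "even-odd":  # odd ranks vs. even ranks
        return [r for r in ranks if r % 2 == 1], [r for r in ranks if r % 2 == 0]
    if scheme == "random":
        rng = rng or random.Random()
        shuffled = ranks[:]
        rng.shuffle(shuffled)
        return sorted(shuffled[: K // 2]), sorted(shuffled[K // 2:])
    raise ValueError(f"unknown scheme: {scheme}")
```

For example, `split_groups(10, "even-odd")` yields the groups `[1, 3, 5, 7, 9]` and `[2, 4, 6, 8, 10]`.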

The sampling algorithms
The first stage in the selection phase assigns probabilities to all competing individuals, whereas in the second stage a sampling algorithm is required to fill the mating pool with parents. This process should reflect the selection probabilities, such that the expected and observed numbers of copies of individuals are equal. In this study, two popular sampling methods, roulette wheel sampling (RWS) and stochastic universal sampling (SUS), are used for testing.

Roulette wheel sampling
RWS was introduced by Holland and is still one of the most popular sampling methods for GAs. In the RWS procedure, each candidate solution is assigned a slice of the wheel whose size is proportional to its selection probability, as assigned by the chosen selection method. A single marker is placed at the border of the roulette wheel, and the wheel is spun $K$ times to select individuals successively. This sampling method is simple and easy to implement, and it gives a high probability of choosing the better chromosomes. Let $o_i$ denote the number of copies of the $i$th individual placed in the mating pool. Clearly, the vector $(o_1, o_2, \ldots, o_K)$ follows a multinomial distribution with parameters $K$ and $P$, where $P = (p_1, \ldots, p_K)$. The mean and variance of this distribution are given below:
$$e_i = \mathrm{E}(o_i) = K p_i, \qquad \mathrm{Var}(o_i) = K p_i (1 - p_i).$$
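A minimal Python sketch of roulette wheel sampling is given below for illustration. It is not the authors' implementation: the function name is hypothetical, and the binary search over cumulative probabilities is an implementation choice made here for brevity rather than a feature of the method as described.

```python
import random
from bisect import bisect_right
from itertools import accumulate
from typing import List


def roulette_wheel_sampling(
    probs: List[float], n_spins: int, rng: random.Random
) -> List[int]:
    """Spin a single-marker wheel n_spins times.

    Returns counts[i] = number of copies of individual i in the mating pool.
    """
    cumulative = list(accumulate(probs))  # right edges of the wheel slices
    counts = [0] * len(probs)
    for _ in range(n_spins):
        u = rng.random() * cumulative[-1]  # marker position on the wheel
        idx = bisect_right(cumulative, u)  # first slice whose edge exceeds u
        counts[min(idx, len(probs) - 1)] += 1  # guard against rounding at the edge
    return counts
```

Because the spins are independent, the resulting count vector is a multinomial draw, so each count has mean $K p_i$ and variance $K p_i (1 - p_i)$ when `n_spins` equals $K$.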

Stochastic universal sampling
The mechanism of SUS was introduced by Baker and is quite similar to RWS, except for the number of markers. In this method, $K$ evenly spaced markers are used at the border of the roulette wheel. The slices of the wheel are the same as in RWS. In the SUS method, the roulette wheel is spun only once, and all individuals pointed to by the $K$ markers are selected and placed in the mating pool as parents. Therefore, all the parents are chosen in just one spin of the wheel, and this method promotes selecting the better individuals at least once. The computational complexity of SUS (i.e. $O(N)$) is lower than that of RWS (i.e. $O(N^2)$), because only one pass over the population is needed to identify the selected candidates as parents. Although the expectations under the two methods are close to each other, the variability in the number of copies of the fittest individuals is considerably lower under SUS than under RWS.

Table 2. The overall expected counts, $\xi_j$, with respect to their classes, $C_j$ ($j = 1, 2, \ldots, 10$).

A detailed comparison between the two sampling methods, i.e. RWS and SUS, can be based on the central moments of the distributions of the vectors $(o_1, o_2, \ldots, o_K)$. The absolute difference between an individual's observed and expected number of copies is defined as bias, i.e. $|o_i - e_i|$. In sampling, each individual may be given a certain number of copies that are placed into the mating pool. The possible range of the number of copies is called the "spread". SUS guarantees minimum spread and almost zero bias.
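The single-spin mechanism of SUS and its minimum-spread property can be illustrated with a short Python sketch. This is an illustrative version with a hypothetical function name, not the authors' code: one random offset is drawn, and $K$ equally spaced pointers are then swept across the wheel in a single pass.

```python
import random
from typing import List


def stochastic_universal_sampling(
    probs: List[float], n_markers: int, rng: random.Random
) -> List[int]:
    """One spin with n_markers equally spaced pointers.

    Returns counts[i] = number of pointers falling in the slice of individual i.
    """
    step = sum(probs) / n_markers
    start = rng.random() * step  # the single random spin
    counts = [0] * len(probs)
    i = 0
    cumulative = probs[0]  # right edge of the current slice
    for m in range(n_markers):
        pointer = start + m * step
        # Advance to the slice containing this pointer (single pass overall).
        while pointer >= cumulative and i < len(probs) - 1:
            i += 1
            cumulative += probs[i]
        counts[i] += 1
    return counts
```

Since consecutive pointers are exactly one step apart, a slice of expected count $e_i = K p_i$ can only receive $\lfloor e_i \rfloor$ or $\lceil e_i \rceil$ pointers, which is the minimum possible spread.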

The chi-square test as a goodness-of-fit measure
For empirical analysis, the chi-square test is used as a measure to ascertain the accuracy of the sampling algorithms, i.e. RWS and SUS, by comparing the sampled outcomes with the probability distribution of the selection operators. As a tool for measuring sampling accuracy, the $\chi^2$ test was first introduced by Schell and Wegenkittl (2001).
Let $\xi_j = \sum_{i \in C_j} e_i$ be the overall expected number, and $O_j = \sum_{i \in C_j} o_i$ the observed (actual) number, of copies of the individuals of class $C_j$ in the mating pool after the sampling procedure. The $c$ disjoint classes are $\{C_1, C_2, \ldots, C_c\}$, with $C_j \subset \{1, 2, \ldots, K\}$ and $\bigcup_{j=1}^{c} C_j = \{1, 2, \ldots, K\}$. The classes are chosen so that each $\xi_j$, $1 \leq j \leq c$, is of the order of $K/c$, so that each class holds the same number of individuals on average. For the desired stochastic accuracy, at least 10 individuals should be expected in each class. The chi-square statistic is defined as:
$$\chi = \sum_{j=1}^{c} \frac{(O_j - \xi_j)^2}{\xi_j}.$$
In the RWS algorithm, $\xi_j \geq 10$ is used, as it minimizes the differences between expected and observed frequencies. On the other hand, for SUS, we expect that $\chi \approx 0$. In Table 2, the probability distributions of all competing selection operators are presented with the corresponding overall expected numbers of individuals (i.e. close to $300/10$). $\chi_{S,R}$ denotes the chi-square measure for the operator $S$ that assigns the probabilities to individuals and the sampling algorithm $R$.
The basic purpose of this test is to describe the sample mean and sample variance of the statistic. The initial population is generated at random. The probability distribution $S$ is used to assign selection probabilities to all the individuals, and then one of the two sampling schemes, $R$, is used to obtain instances of $o_i$, $O_j$ and $\chi_{S,R}$, respectively. From the sequence $(\chi_{S,R}^{(k)})_{1 \leq k \leq s}$, the sample mean and variance can be computed as:
$$\hat{\mu} = \frac{1}{s}\sum_{k=1}^{s} \chi_{S,R}^{(k)}, \qquad \hat{\sigma}^2 = \frac{1}{s-1}\sum_{k=1}^{s}\left(\chi_{S,R}^{(k)} - \hat{\mu}\right)^2.$$
These are compared with the theoretical $\chi^2_{c-1}$ distribution using a 99% confidence interval. The sample mean and variance of the chi-square statistic should be close to $c - 1 = 9$ and $2(c - 1) = 18$, respectively, and the estimates $\hat{\mu}$ and $\hat{\sigma}^2$ are provided in Table 3. The SUS results are also reported in this table, which show its sampling accuracy as well. The average accuracy of each sampling method with all competing selection schemes can be observed from these empirical results.
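The procedure described above can be sketched in Python as follows. This is an illustrative sketch under several assumptions rather than the authors' experiment: the statistic is taken as $\sum_j (O_j - \xi_j)^2 / \xi_j$, classes are built greedily from consecutive individuals so that each has an expected count near $K/c$, RWS is emulated with `random.Random.choices` (independent weighted draws with replacement), and the sample variance uses the divisor $s - 1$. All function names are hypothetical.

```python
import random
import statistics
from typing import List, Tuple


def build_classes(expected: List[float], c: int) -> List[List[int]]:
    """Greedy partition of consecutive indices into at most c classes whose
    expected counts are each close to sum(expected) / c."""
    target = sum(expected) / c
    classes: List[List[int]] = []
    current: List[int] = []
    acc = 0.0
    for i, e in enumerate(expected):
        current.append(i)
        acc += e
        if len(classes) < c - 1 and acc >= target * (len(classes) + 1) - 1e-9:
            classes.append(current)
            current = []
    if current:
        classes.append(current)
    return classes


def chi_square(expected: List[float], observed: List[int],
               classes: List[List[int]]) -> float:
    """Pearson statistic over the classes: sum_j (O_j - xi_j)^2 / xi_j."""
    stat = 0.0
    for cls in classes:
        xi = sum(expected[i] for i in cls)
        big_o = sum(observed[i] for i in cls)
        stat += (big_o - xi) ** 2 / xi
    return stat


def rws_chi_square_runs(probs: List[float], c: int, s: int,
                        rng: random.Random) -> Tuple[float, float]:
    """Repeat RWS s times; return the sample mean and variance of the statistic."""
    K = len(probs)
    expected = [K * p for p in probs]
    classes = build_classes(expected, c)
    stats = []
    for _ in range(s):
        observed = [0] * K
        for i in rng.choices(range(K), weights=probs, k=K):
            observed[i] += 1
        stats.append(chi_square(expected, observed, classes))
    return statistics.mean(stats), statistics.variance(stats)
```

With $c = 10$ classes, the returned mean and variance should lie near $c - 1 = 9$ and $2(c - 1) = 18$ when the sampler reproduces the assigned probabilities.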

Empirical distribution function analysis
In this section, the empirical distribution function (EDF) under roulette wheel sampling is compared with the theoretical chi-square distribution $\chi^2_{c-1}$. The EDF is given as:
$$\hat{F}(t) = \frac{1}{s}\sum_{k=1}^{s} \mathbf{1}\left\{\chi_{S,R}^{(k)} \leq t\right\}.$$
In Figure 1, the behaviors of the EDF (dashed line) for various selection operators, for a population size $K = 300$ with a similar number of tests, are reported. The selection operators are compared with the theoretical $\chi^2_{c-1}$ distribution (dark thick line) using a 99% confidence band under the hypothesis of RWS (dashed thin double line). Here, the range of the x-axis values is $t \in [0, 18]$, and the expected value of $\chi_{S,R}^{(k)}$ is $\mathrm{E}[\chi_{S,R}^{(k)}] = c - 1 = 9$. RWS yields an empirical distribution function that does not differ significantly from the theoretical $\chi^2_{c-1}$ distribution according to the sample mean $\hat{e}$ and sample variance $\hat{\sigma}^2$ statistics.
Global performance
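For illustration, the empirical distribution function of a set of simulated statistics can be computed as below. The sketch assumes the usual definition $\hat{F}(t) = \frac{1}{s}\,\#\{k : \chi^{(k)} \leq t\}$; the function name is hypothetical.

```python
from bisect import bisect_right
from typing import Callable, List


def empirical_distribution_function(samples: List[float]) -> Callable[[float], float]:
    """Return F_hat with F_hat(t) = (number of samples <= t) / s."""
    ordered = sorted(samples)
    s = len(ordered)

    def f_hat(t: float) -> float:
        # bisect_right counts how many sorted samples are <= t.
        return bisect_right(ordered, t) / s

    return f_hat
```

Evaluating the returned function on a grid over $t \in [0, 18]$ gives the step curve that is compared with the theoretical distribution.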

The traveling salesman problem
The traveling salesman problem (TSP) is the most illustrious, noteworthy and historic benchmark among hard combinatorial optimization problems. In this problem, a salesman seeks the shortest Hamiltonian tour that starts from a city, visits all other cities exactly once and returns to the initial city. The first to document this problem was Euler in 1759; see, for example, Larranaga et al. (1999). It is a fundamental problem with many applications in engineering, discrete mathematics, operations research, graph theory and computer science, etc. Given $n$ cities with a distance (cost) matrix $C = [c_{ij}]_{n \times n}$, where $c_{ij}$ is the distance between city $i$ and city $j$, we search for a permutation $\lambda : \{0, \ldots, n-1\} \to \{0, \ldots, n-1\}$ that minimizes the traveled distance $f(\lambda, C)$:
$$f(\lambda, C) = \sum_{i=0}^{n-2} d(c_{\lambda(i)}, c_{\lambda(i+1)}) + d(c_{\lambda(n-1)}, c_{\lambda(0)}),$$
where $\lambda(i)$ is the city visited at position $i$ of the tour, $d(c_i, c_j)$ is the distance from city $i$ to city $j$, and $(x_i, y_i)$ is the position of city $i$ in the plane. The Euclidean distances in the distance matrix $C$ between city $i$ and city $j$ are obtained as:
$$d(c_i, c_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}.$$
TSP is easy to understand but very difficult to solve; e.g. with 100 cities, there are about $10^{155}$ possible tours. This is the main reason it is classified as a non-deterministic polynomial-time hard (NP-hard) problem; see, for example, Helsgaun (2000) and Applegate et al. (2006). Hence, this type of problem cannot practically be solved using traditional optimization algorithms, e.g. gradient-based methods. To attain an optimal or near-optimal solution within an adequate amount of time, heuristic algorithms are better choices for managing NP-hard problems (Huang et al., 2015; Ruiz et al., 2015; Hussain and Muhammad, 2020). GAs have also been applied to this problem in different ways; see, for example, Potvin (1996); Larranaga et al. (1999); Moon et al. (2002); Nagata and Soler (2012) and Hussain et al. (2017).
Some test problems are taken from the traveling salesman problem library (TSPLIB) to assess the global performance of the newly devised selection operator relative to existing ones; they are reported in Table 4.
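The objective function can be illustrated with a short Python sketch. It assumes that a tour is stored as a list whose entry at position $i$ is the city visited at that position, and that distances are Euclidean distances between planar coordinates; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def euclidean_distance_matrix(
    coords: Sequence[Tuple[float, float]]
) -> List[List[float]]:
    """C[i][j] = Euclidean distance between city i and city j."""
    n = len(coords)
    return [
        [math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
         for j in range(n)]
        for i in range(n)
    ]


def tour_length(tour: Sequence[int], C: List[List[float]]) -> float:
    """Length of the closed tour, including the return to the first city."""
    n = len(tour)
    return sum(C[tour[i]][tour[(i + 1) % n]] for i in range(n))
```

For four cities at the corners of the unit square, the tour `[0, 1, 2, 3]` along the perimeter has length 4, whereas a tour that crosses both diagonals is longer.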

The state-of-the-art settings
In the simulation study, all GA programs were coded in MATLAB. Two stopping criteria are used: no improvement found in 300 successive generations, and a maximum number of generations (i.e. 5000). The order crossover (OX) and the well-known exchange mutation (EM) operators are used in this study. Table 5 provides further information about the parameter settings.
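For readers unfamiliar with the two genetic operators, a Python sketch of their common textbook forms is given below. It is only an illustration: the exact variants used in the authors' MATLAB programs are not specified here, and the function names are hypothetical. In this form of OX, a segment of the first parent is copied to the child and the remaining positions are filled, starting after the second cut point, with the cities of the second parent in the order in which they appear after that cut.

```python
import random
from typing import List


def order_crossover(p1: List[int], p2: List[int], a: int, b: int) -> List[int]:
    """Order crossover (OX) with cut points 0 <= a < b <= len(p1)."""
    n = len(p1)
    child = [-1] * n
    child[a:b] = p1[a:b]  # inherit the segment from the first parent
    kept = set(p1[a:b])
    # Cities of p2 in cyclic order starting after the second cut, skipping kept ones.
    donors = [p2[(b + k) % n] for k in range(n) if p2[(b + k) % n] not in kept]
    # Free positions of the child in the same cyclic order.
    slots = [(b + k) % n for k in range(n) if not a <= (b + k) % n < b]
    for pos, city in zip(slots, donors):
        child[pos] = city
    return child


def random_order_crossover(p1: List[int], p2: List[int],
                           rng: random.Random) -> List[int]:
    """OX with two distinct random cut points."""
    a, b = sorted(rng.sample(range(len(p1) + 1), 2))
    return order_crossover(p1, p2, a, b)


def exchange_mutation(tour: List[int], rng: random.Random) -> List[int]:
    """Exchange mutation (EM): swap the cities at two distinct random positions."""
    i, j = rng.sample(range(len(tour)), 2)
    mutated = tour[:]
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```

Both operators return a valid permutation of the cities, so the offspring always represent feasible tours.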

Simulation results and discussion
In the above sections, the relative characteristics of the proposed operator and its competing selection methods with respect to sampling accuracy and population diversity have been examined. In this section, we compare the performance of RRTS with the other schemes by applying it to the TSP. The results of six competing selection schemes with the most popular genetic operators, i.e. order crossover (OX) and exchange mutation (EM), are provided in Table 6, where all the tests are repeated thirty times. These computational results are compared on the basis of the average, standard deviation (S.D) and relative efficiency (R.E). Since TSP is a minimization problem, we observed an improved performance, based on 5000 simulations, by the proposed operator among all six competing selection operators. These results indicate that RRTS outperforms the others.

Table 6. Results of various selection methods with respect to OX (crossover) and EM (mutation) operators.

Conclusions
For every optimization algorithm, the main aim is to balance two extremes, i.e. exploration and exploitation. This article presents a new round-robin based tournament selection operator for GAs, which offers a fine balance between exploitation and exploration. The individuals are sorted with respect to their fitness measures, and then the whole population is divided into two equal and non-overlapping groups, i.e. $A$ and $A^c$. To determine the sampling accuracy, we employ the $\chi^2$ test to confirm a close match (insignificant difference) between the expected and observed numbers of offspring. A simulation study is performed to evaluate the performance of the newly devised selection operator along with some conventional operators. Based on this research, we suggest that the proposed operator may be used as a better alternative for obtaining global optima or near-optimal results. Moreover, researchers may apply it to other problems addressed by evolutionary algorithms.

Disclosure statement
No conflict of interest is declared by author(s).